Systems, methods, and computer program products for compression, digital watermarking, and other digital signal processing for audio and/or video applications

ABSTRACT

Systems, methods, and computer program products are provided for digital signal processing, which may include obtaining a first set of digitized coefficients from source data and determining a best-fit distribution of a generalized Gaussian distribution for the set of digitized coefficients. The digital signal processing may further include applying a quantization algorithm to the first set of plurality of digitized coefficients to obtain a second set of quantizers, wherein the quantization algorithm is based at least in part on the determined best-fit distribution, and providing second set of quantizers as a compressed representation of the source data. The digital signal processing may also include retrieving suspected encoded data, determining at least one parameter of a generalized Gaussian distribution for the suspected encoded data, determining a digital watermark within the suspected encoded data based at least in part on the determined parameter, and extracting the digital watermark from the suspected encoded data.

RELATED APPLICATIONS

The present invention claims benefit of U.S. Provisional ApplicationSer. No. 60/748,967, filed Dec. 9, 2005, and entitled “Systems, Methods,and Computer Program Products for Determining Parameters of aGeneralized Gaussian Distribution,” which is hereby incorporated byreference in its entirety as if fully set forth herein.

FIELD OF THE INVENTION

Aspects of an embodiment of the invention relate generally to digitalsignal processing, and more particularly to compression, digitalwatermarking, and other digital signal processing for audio and/or videoapplications.

BACKGROUND OF THE INVENTION

The continued growth of Internet has resulted in the increasing use ofdigital media, which includes audio, video, and a combination thereof.However, limited bandwidth and transmission speeds over the Internetstill make the real-time transmission of digital media difficult sincethe source feed may be of high resolution, of a large size, and/or of anessentially random nature, thereby making the source feed difficult tocompress instantaneously for real-time transmission.

Accordingly, there is a need in the industry for systems, methods, andcomputer program products that facilitate in the compression andtransmission of digital media. Additionally, there is a need in theindustry for reliability encoding and decoding invisible digitalwatermarks within the digital media for determining unauthorizeddistribution of such digital media.

SUMMARY OF THE INVENTION

According to an embodiment of the present invention, there is acomputer-implemented method for performing digital signal processing.The computer-implemented method includes obtaining a first set ofdigitized coefficients from source data and determining a best-fitdistribution of a generalized Gaussian distribution for the set ofdigitized coefficients. The computer-implemented method further includesapplying a quantization algorithm to the first set of plurality ofdigitized coefficients to obtain a second set of quantizers, where thequantization algorithm is based at least in part on the determinedbest-fit distribution, and providing second set of quantizers as acompressed representation of the source data.

According to another embodiment of the present invention, there is asystem for performing digital signal processing, The system includes amemory for storing executable instructions and a processor incommunication with the memory. The processor is operable to execute thestored instructions to obtain a first set of digitized coefficients fromsource data and determine a best-fit distribution of a generalizedGaussian distribution for the set of digitized coefficients. Theprocessor is further operable to apply a quantization algorithm to thefirst set of plurality of digitized coefficients to obtain a second setof quantizers, where the quantization algorithm is based at least inpart on the determined best-fit distribution, and provide second set ofquantizers as a compressed representation of the source data.

According to still another embodiment of the present invention, there isa computer-implemented method for performing digital signal processing.The computer-implemented method includes retrieving suspected encodeddata and determining at least one parameter of a generalized Gaussiandistribution for the suspected encoded data. The computer-implementedmethod includes determining a digital watermark within the suspectedencoded data based at least in part on at least one the determinedparameter and extracting the digital watermark from the suspectedencoded data.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S)

Having thus described the invention in general terms, reference will nowbe made to the accompanying drawings, which are not necessarily drawn toscale, and wherein:

FIG. 1 is an overview of a digital signal processing system, accordingto an embodiment of the present invention.

FIG. 2 illustrates an exemplary flow diagram for the compression ofsource data, according to an exemplary embodiment of the presentinvention.

FIG. 3 illustrates an exemplary flow diagram for the decompression ofsource data, according to an exemplary embodiment of the presentinvention.

FIG. 4 illustrates an exemplary flow diagram for encoding digitized datawith a digital watermark, according to an embodiment of the presentinvention.

FIG. 5 illustrates an exemplary flow diagram for verifying whether adigital watermark is embedded within digitized data, according to anexemplary embodiment of the present invention.

FIG. 6 illustrates exemplary plots of Z(θ) as a function of θcorresponding to four different values of the true parameter θ₀,according to an exemplary embodiment of the present invention.

FIG. 7 illustrates exemplary plots of Z_(n)(θ) as a function of θ basedon a random sample of size n, according to an exemplary embodiment ofthe present invention.

FIGS. 8 and 9 illustrate global convergence of the Newton-Raphsonalgorithm in accordance with an exemplary embodiment of the presentinvention.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter withreference to the accompanying drawings, in which some, but not allembodiments of the inventions are shown. Indeed, these inventions may beembodied in many different forms and should not be construed as limitedto the embodiments set forth herein; rather, these embodiments areprovided so that this disclosure will satisfy applicable legalrequirements. Like numbers refer to like elements throughout.

The present invention is described below with reference to blockdiagrams of systems, methods, apparatuses and computer program productsaccording to an embodiment of the invention. It will be understood thateach block of the block diagrams, and combinations of blocks in theblock diagrams, respectively, can be implemented by computer programinstructions. These computer program instructions may be loaded onto ageneral purpose computer, special purpose computer, or otherprogrammable data processing apparatus to produce a machine, such thatthe combination of computing hardware and instructions which executethereon constitute means for implementing the functionality of eachblock of the block diagrams, or combinations of blocks in the blockdiagrams discussed in detail in the descriptions below.

These computer program instructions may also be stored in acomputer-readable memory to constitute an article of manufacture. Thearticle of manufacture may be used in conjunction with a computingdevice to cause the instructions from the article of manufacture to beloaded onto and executed by the computing device, and thereby implementthe function specified in the block or blocks.

The computer program instructions may also be loaded onto a computer orother programmable data processing apparatus to cause a series ofoperational steps to be performed on the computer or other programmableapparatus to produce a computer implemented process such that theinstructions that execute on the computer or other programmableapparatus provide steps for implementing the functions specified in theblock or blocks.

Accordingly, blocks of the block diagrams support combinations of meansfor performing the specified functions, combinations of steps forperforming the specified functions and program instruction means forperforming the specified functions. It will also be understood that eachblock of the block diagrams, and combinations of blocks in the blockdiagrams, can be implemented by general or special purposehardware-based computer systems that perform the specified functions orsteps, or combinations of general or special purpose hardware andcomputer instructions.

The inventions may be implemented through one or more applicationprograms running on one or more operating systems of one or morecomputers. The inventions also may be practiced with diverse computersystem configurations, including hand-held devices, multiprocessorsystems, microprocessor based or programmable consumer electronics,mini-computers, mainframe computers, and the like.

Application programs that are components of the invention may includemodules, objects, data structures, etc., that perform certain tasks orimplement certain abstract data types. A particular application program(in whole or in part) may reside in a single or multiple memories.Likewise, a particular application program (in whole or in part) mayexecute on a single or multiple computers or computer processors.Exemplary embodiments of the present invention will hereinafter bedescribed with reference to the figures, in which like numerals indicatelike elements throughout the several drawings.

Embodiments of the present invention may provide for systems, methods,and computer program products that may provide signal processing and/ordigital watermarking for audio and video applications, including thedistribution (e.g., real-time distribution) of audio and video over theInternet and other mediums (e.g., DVD, CDs, flash drives, etc.). Indeed,as will be discussed in further detail below, embodiments of the presentinvention may provide exemplary compression methodologies for digitalmedia transmission and/or distribution. Further embodiments of thepresent invention, may provide for highly reliable digital watermarkingretrieval and verification methodologies for such digital media.

I. System Overview

FIG. 1 is an illustrative system overview of a signal processing system100 in accordance with an embodiment of the present invention. As shownin FIG. 1, the signal processing system 100 may include one or morecomputers 105 in communication with a control unit 110 via one or morenetworks 108. The network 108 may include a variety of wired andwireless networks, both private and public, including the Internet, alocal area network, a metro area network, a wide area network, a publicswitched telephone network, or any combination thereof.

Each computer 105 may include client software (e.g., applet, stand-alonesoftware, a web client, Internet portal) to interface with the controlunit 110. The control unit 110 may provide data to and/or receive datafrom the computer 105. According to an embodiment of the presentinvention, the control unit 110 may compress source data and distributethe compressed data to one or more computers 105. Likewise, the controlunit 110 may also be operative to embed digital watermarks within thedata to the distributed (e.g., images, audio, etc.) and/or retrieve andverify digital watermarks from such data, according to embodiments ofthe present invention.

Still referring to FIG. 1, the control unit 110 may include a memory 155that stores programmed logic 165 (e.g., software) in accordance with anembodiment of the present invention. The programmed logic 165 mayinclude one or more modules for executing compression, digitalwatermarking, and other signal processing methodologies, or combinationsthereof, in accordance with embodiments of the present invention. Thememory 155 may include data 170 that may be utilized in the operation ofthe present invention and an operating system 175. The data 170 mayinclude source data for distribution as well as received data forverification and extraction of a digital watermark. The data 170 mayfurther include parameters associated with one or more of thecompression, digital watermarking, and other signal processing methodsdescribed herein. A processor 177 may utilize the operating system 175to execute the programmed logic 165, and in doing so, may also utilize(e.g., store, modify, and/or retrieve) the data 170.

A data bus 180 may provide communication between the memory 155 and theprocessor 177. Users may interface with the control unit 110 via a userinterface device(s) 185 such as a keyboard, mouse, control panel,display, microphone, speaker, or any other devices capable ofcommunicating information to or from the control unit 110. The controlunit 110 may be in communication with other external devices via I/OInterface(s) 190. Additionally, the control unit 110 may include anetwork interface 195 for communication with the network 100, includingone or more computers 105. Further the control unit 110 and theprogrammed logic 165 implemented thereby may comprise software,hardware, firmware or any combination thereof. The control unit 110 maybe a personal computer, mainframe computer, minicomputer, any othercomputer device, or any combination thereof without departing fromembodiments of the present invention.

II. Operational Compression Methodology

The operation of the signal processing system 100 of FIG. 1 will now bediscussed in further detail with respect to FIGS. 2 and 3. FIG. 2illustrates an exemplary flow diagram for compression of the source datawhile FIG. 3 illustrates an exemplary flow diagram for the decompressionof the source data, according to an embodiment of the present invention.As block 202 of FIG. 2, a control unit 110 may receive source data fordistribution. According to an exemplary embodiment of the presentinvention, this source data may be in analog form and include audio,video, and a combination thereof, and the like. This source data mayalso be essentially real-time source data. In block 202, the controlunit 110 and in particular, the component programmed logic 165, mayexecute instructions for retrieving digitized coefficients (e.g.,transform coefficients) from the source data. These digitizedcoefficients (e.g., transform coefficients) may utilize frequencytransformation coefficients, including discrete cosine transform (DCT)coefficients, according to an embodiment of the present invention. Forexample, with DCT coefficients, the source data may be transformed fromthe time domain to the frequency domain. Other transform coefficientsmay include Fast Fourier transform (FFT) coefficients, Wavelet Transform(WT) coefficients, or Laplace transform coefficients. However, one ofordinary skill in the art will readily recognize that these digitizedcoefficients may be obtained via methods other than transforms. Forexample, the digitized coefficients may be obtained from directmeasurements in a spatial or time domain such as raw image pixelintensity values or sound intensity values.

The digitized coefficients, which are typically represented by sequencesof real numbers, retrieved from the source data of block 204 mayessentially form a random distribution. In block 206, a globallyconvergent method in accordance with an embodiment of the presentinvention may determine a best-fit distribution from a generalizedfamily of distributions. The generalized family of distributions may bea generalized Gaussian distribution, which will be described in furtherdetail below. The best-fit distribution may be determined based upon theparameters (e.g., shape parameter, scale parameter) determined for thegeneralized Gaussian distribution. According to an exemplary aspect ofthe present invention, the parameters may determine that the best-fitdistribution is a Gaussian Distribution, a Laplacian Distribution, orthe like.

In block 208, the transform coefficients determined in block 204 can becompressed and/or summarized using quantizers in accordance with aquantization algorithm (e.g., Lloyd-Max, uniform scalar dead-zonequantization, generalized uniform scalar dead-zone quantization, trelliscoded quantization, vector quantization, entropy constrained scalarquantization (for DWT/DCT quantization tables, etc.). As describedabove, transform coefficients may be a real number along a real line.Generally, such a real line may be partitioned using quantizers, whereeach quantizer represents a portion or range (e.g., quantization stepsize) of the real line. Accordingly, transform coefficients that fallwithin a range or portion of the real line may be assigned to thequantizer that represents the respective range or portion. Further, adead-zone may be indicated on the real line, perhaps from −δ to +δ, anda simplified quantizer (e.g., null or zero) may be provided forcoefficients that fall within the dead-zone. Therefore, the source datamay be compressed because each quantizer may be smaller in size than thetransform coefficient that it represents. Furthermore, the source datamay be compressed because the number of quantizer levels may be lessthan the number of digitized coefficients.

Still referring to block 208, in accordance with an embodiment of thepresent invention, a quantization algorithm (Lloyd-Max, uniform scalardead-zone quantization, generalized uniform scalar dead-zonequantization, trellis coded quantization, etc.) may be based at least inpart on a generalized Gaussian distribution. In such a case, embodimentsof the present invention may provide an efficient and globallyconvergent method for determining parameters for a generalized Gaussiandistribution, which may then be utilized by the quantization algorithmfor determining the quantizers, the quantization step size, and/or thedead-zone. More specifically, these parameters may include shape andscale parameters for the generalized Gaussian distribution associatedwith the transform coefficients (previously determined in block 206). Asshown in block 210, the quantizers determined in block 208 may then betransmitted to one or more of the computers 105 as a compressedrepresentation of the source signal.

FIG. 3 illustrates an exemplary flow diagram for the decompression ofthe source data, according to an exemplary embodiment of the presentinvention. In block 302, the computer 105 may receive the quantizersfrom the control unit 110. In addition to the quantizers, the computer105 may have also received decompression information, includinginformation associated with the type of quantization methodologyutilized and the best-fit distribution (e.g., information associatedwith the shape and scale parameters determined for the generalizedGaussian distribution) utilized previous in step 208. Likewise, thecomputer 105 may have also received information regarding thedigitization methodology of step 204. In block 304, the computer 105 mayreconstruct the digitized coefficients from the received quantizers.Having retrieved or reconstructed the digitized coefficients, thedigitized coefficients can then be decoded to obtain the source data, oran approximation thereof (block 306). As an example, DCT coefficientsmay be transformed from the frequency domain back to the time domain toobtain the source data. Many variations of the compression anddecompression methodology described above may be available withoutdeparting from embodiments of the present invention.

III. Operational Digital Watermarking Methodology

In accordance with an embodiment of the present invention, digitalwatermarks may be added or embedded as extra bits to digitized data(e.g., digitized coefficients), such as audio, video, and a combinationthereof. These extra bits (“the digital watermark”) may provideinformation regarding the origin of the digital data (e.g., author,place, date, and other identification information). The extra bitstypically do not change the digitized data to a perceptible extent,according to an embodiment of the present invention. FIG. 4 illustratesan exemplary flow diagram for encoding the digitized data with thedigital watermark, according to an exemplary embodiment of the presentinvention.

In FIG. 4, the digitized coefficients may be retrieved from source data(block 402). As described previously, these digitized coefficients maybe transform coefficients, including DCT, FFT, WT, or Laplacecoefficients. Likewise, these digitized coefficients may be obtained viamethods other than transforms. For example, the digitized coefficientsmay be obtained from direct measurements in a spatial or time domainsuch as raw image pixel intensity values or sound intensity values.These digitized coefficients may be obtained via methods other thantransforms, including from direct measurements in a spatial or timedomain such as raw image pixel intensity values or sound intensityvalues.

In block 404, the digital watermark (e.g., one or more groups of bits,etc.) may be embedded within the digitized coefficients to form encodeddigitized coefficients. According to an embodiment of the presentinvention, the encoded digitized coefficients y_(i) may be determined asy_(i)=x_(i)*(l+α*w_(i)) in accordance with an exemplary spread spectrumwatermarking, where x_(i) is the digitized coefficient (e.g., a DCTcoefficient, DWT, FFT, from independent component analysis, etc.) andw_(i) is the digital watermark information. Generally, α may be selectedby the encoder and may be a relatively small value so that the digitalwatermark information w_(i) will remain imperceptible to the user.

According to another embodiment of the present invention, otherwatermarking methods may be utilized, including an exemplaryquantization-based watermarking method. With such a quantization-basedwatermarking method, the a plurality of the digitized coefficients(e.g., DCT coefficients) may be quantized as described above and rounded(e.g., perhaps to the nearest integer). The rounding method may beindicative of the quantization-based watermarking method. For example,the quantization-based watermarking method may round a plurality ofquantized values, as necessary, to an even number or an odd number(e.g., encoded with a 0 or 1).

It will be appreciated that other watermarking methods may also beutilized in accordance with alternative embodiments of the presentinvention, including quantization index modulation and hybridwatermarking methods. However, for these methods to be effective, it maybe necessary to accurately model the distributions of the digitizedcoefficients (e.g., DCT, DWT, FFT, Fourier-Mellin transform,coefficients generated by independent component analysis (ICA), and thelike), thereby allowing for more precise quantization tables to be usedwithin these methods. Many other variations of the watermarking methodsare available.

After the digital watermark has been embedded within the digitizedcoefficients (block 404), then the encoded digitized data may betransmitted or otherwise distributed. (block 406). Further, informationregarding the encoding (e.g., the digitization method, the digitalwatermarking method, etc.) may also be transmitted or otherwisedistributed. Although not explicitly illustrated as such, the encodeddigitized data may be compressed prior to transmission or distribution.Such compression may be provided as described herein in accordance withan exemplary embodiment of the present invention.

FIG. 5 illustrates a flow diagram for verifying whether digitalwatermark is embedded within digitized data, according to an exemplaryembodiment of the present invention. In FIG. 5, the suspected encodeddata is obtained for processing (block 500). In block 502, the watermarkmay be detected within the suspected encoded data in accordance with anembodiment of the present invention. For example, the detection may relyupon one or more test statistics for the suspected encoded data, whichmay include the maximum likelihood-based test statistics or Bayesianstatistics. In particular, the test statistics may be evaluated asexceeding a given threshold value based upon the parameters determinedfor a generalized Gaussian distribution associated with the suspectedencoded data. Alternatively, the probability of the test statisticexceeding a given threshold value may be determined based upon theparameters determined for a generalized Gaussian distribution associatedwith the suspected encoded data. Accordingly, the evaluation orcomputation of the probability of the test statistic exceeding a giventhreshold value may be indicative of whether a watermark has beendetected (block 504).

If the evaluation or computation the probability of the test statisticexceeding a given threshold is sufficiently small (e.g., therebyrejecting the null hypothesis that there is no digital watermark) (block504), then the suspected digital watermark may be detected and extracted(block 506). Otherwise, the suspected encoded data may be determined notto include a digital watermark (block 508).

Returning back to block 506, the suspected digital watermark may beextracted. This extraction may be based upon either a non-blindextraction or a blind extraction in accordance with an embodiment of thepresent invention. First, with a non-blind extraction, the originalencoded data and the digital watermarking method may be known. Using theoriginal encoded signal and the suspected encoded signal (and perhapsthe digital watermarking method), an extraction may be based upon adifference between the original embedded data and the suspected embeddeddata, according to an exemplary embodiment of the present invention.Second, according to a blind extraction, the original encoded data maynot be necessary for the extraction of the digital watermark. Asdescribed above, one of the digital watermarking methods may round aplurality of quantized values to either an odd or even number.Accordingly, by examining the suspected encoded data, if a significantnumber of the expected plurality of coefficients in the suspectedencoded data are either odd or even, then the digital watermark may bedecoded using the appropriate odd or even system (e.g., using the 0 or 1decoding).

IV. Methodology for Determining the Best-Fit Distribution of GeneralizedGaussian Distribution

The determination of the best-fit distribution for the transformcoefficients, as introduced in block 206 of FIG. 2, will now bediscussed in further detail below. Generally, the methodology mayprovide a globally-convergent method for determining the parameters of ageneralized Gaussian distribution family, according to an embodiment ofthe present invention. These parameters of the generalized Gaussiandistribution may be indicative of the type of best-fit distribution fromthe generalized Gaussian distribution family. For example, theparameters may indicate a Gaussian or Laplace distribution.

In accordance with an embodiment of the present invention, thetwo-parameters (shape parameter θ and scale parameter σ) of thegeneralized Gaussian distribution may be utilized in digital signalprocessing (e.g., steps 206, 208 of FIG. 2). In particular, thegeneralized Gaussian distribution (GGD) has a probability densityfunction that may be specified by:

$\begin{matrix}{{{f\left( {x,\sigma,\theta} \right)} = {\frac{\theta}{2{{\sigma\Gamma}\left( {1/\theta} \right)}}{\mathbb{e}}^{- {\frac{x}{\sigma}}^{\theta}}}},} & (1)\end{matrix}$where θ represents the shape parameter and σ represents the scaleparameter (collectively referred to as parameters of the GGD).

In accordance with an embodiment of the present invention, the shapeparameter θ (and thus the scale parameter σ) of the generalized Gaussiandistribution may be estimated with very high accuracy using acomputationally-efficient and globally-convergent methodology. Morespecifically, an estimated shape parameter is provided by the root ofthe following shape parameter estimation equation:

${{Z_{n}(\theta)} = {{\frac{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\theta}}}{\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right)^{2}} - \left( {\theta + 1} \right)} = 0}},$where X_(l) . . . X_(n) represent the digitized coefficients.

Given any starting value of the shape parameter θ, the Newton-Raphsonroot-finding iteration method, as well known in the art, may be appliedto the estimated shape equation Z_(n)(θ)=0 to find the actual value ofthe shape parameter θ. The Newton-Raphson root-finding iteration methodfor the above estimated shape equation Z_(n)(θ)=0 will always convergeto a unique global root representing the actual shape parameter θ.

According to an another embodiment of the present invention, atransformation, which may be a logarithmic transformation, of theestimated shape equation Z_(n)(θ)=0 may be utilized determine the actualvalue of the shape parameter θ. If a logarithmic transformation

$\left( {{e.g.},\mspace{11mu}{{\log\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\theta}}} \right)} - {\log\left( \left( {\frac{1}{n}{\sum\limits_{{i =}\rbrack}^{n}{X_{j}}^{\theta}}} \right)^{2} \right)} - {\log\left( {\theta + 1} \right)}}} \right)$is utilized, then the algorithm may converge to a unique global root infewer iterations. Further, Halley's method may be utilized forZ_(n)(θ)=0 or the logarithmic transformation of Z_(n)(θ)=0, which mayalways converge to a unique global root representing the actual shapeparameter θ. In accordance with an embodiment of the present invention,the use of Halley's method may provide global convergence to a uniqueglobal root (e.g., actual shape parameter θ) in significantly fewer(e.g., approximately half as many) iterations. It will be appreciatedthat other transformations and root-solving methods may be utilized forthe estimated shape equation Z_(n)(θ)=0 without departing fromembodiments of the present invention.

Moreover, it will be appreciated that other parameterizations of theGGD, including those associated with JPEG compression (e.g., JPEG-2000),may be available without departing from embodiments of the presentinvention. For example, according to an alternative embodiment, the GGDprobability function p(x) may be specified instead by:

${{p(x)} = {\frac{\alpha}{2{{\sigma\Gamma}\left( {1/\alpha} \right)}}\sqrt{\frac{\Gamma\left( {3/\alpha} \right)}{\Gamma\left( {1/\alpha} \right)}}{\exp\left( {- \left( {\sqrt{\frac{\Gamma\left( {3/\alpha} \right)}{\Gamma\left( {1/\alpha} \right)}}\;\left( \frac{x}{\sigma} \right)} \right)^{\alpha}} \right)}}},$where α is similar to the shape parameter θ described above and

${\sigma\left( \sqrt{\frac{\Gamma\left( {1/\alpha} \right)}{\Gamma\left( {3/\alpha} \right)}} \right)}^{1/\alpha}$is similar to the scale parameter σ described above. Otherparameterizations of the GGD are available without departing fromembodiments of the present invention,

The following discussion illustrates the global convergence of the shapeestimator equation when applied to the generalized Gaussian distributionfamily.

A. Shape Parameter Estimator Estimation Equation

To motivate this method, we denote the true value of the parametervector (σ, θ) by (σ₀, θ₀), and suppose the random variable X isdistributed according to the GGD with the parameter vector (σ₀, θ₀).Here, expectations and variances computed with respect to the truedistribution are denoted by E₀ and Var₀ respectively. Any probabilitystatement made with respect to this distribution is denoted by P₀. LetY=|X/σ₀|^(θis 0). It follows from a simple calculation that Y has thegamma distribution with the PDF g(y; θ₀)=y^(1/θ) ⁰ ⁻¹e^(−y)/Γ(1/θ₀).Hence, the first two moments of Y are given by EY=1/θ₀ and EY²=(1+θ₀)/θ₀². Thus

$\begin{matrix}{\frac{E_{0}{\frac{X}{\sigma_{0}}}^{2\theta_{0}}}{\left( {E_{0}{\frac{X}{\sigma_{0}}}^{\theta_{0}}} \right)^{2}} = {\theta_{0} + 1}} & (2)\end{matrix}$Equation (2) is trivially equivalent to the equation

$\frac{E_{0}{X}^{2\theta_{0}}}{\left( {E_{0}{X}^{\theta_{0}}} \right)^{2}} = {\theta_{0} + 1}$This fact leads to the consideration of the following simple function ofthe shape parameter θ:

$\begin{matrix}{{Z(\theta)} = {\frac{E_{0}{X}^{2\theta}}{\left( {E_{0}{X}^{\theta}} \right)^{2}} - \left( {\theta + 1} \right)}} & (3)\end{matrix}$

The idea is to consider Z as a function of the shape parameter θ andfind a root of the shape equation Z(θ)=0. In Section B, it will beestablished that the shape equation has, on the positive real line, aunique global root which is equal to the true parameter value θ₀. Theunique global root is illustrated in FIG. 6, which shows plots of Z(θ)as a function of θ corresponding to four different values of the trueparameter θ₀. Still referring to FIG. 6, in (a), θ₀=2; in (b), θ₀=1.5;in (c), θ₀=1; and in (d) θ₀=0.5.

In addition, it will be shown that the Newton-Raphson root-findingalgorithm constructed by functional iteration of the shape equationconverges globally to the unique root from any starting point in thesemi-infinite interval Θ_(min). Since the expectation in the expressionof Z is taken with respect to the true parameter value θ₀, which isunknown in practice, we estimate the shape function Z(θ) by the sampleestimator Z_(n)(θ) based on a random sample {X_(i)}_(i=1) ^(n) from theGGD(σ₀, θ₀), where Z_(n)(θ) is defined according to

$\begin{matrix}{{Z_{n}(\theta)} = {\frac{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\theta}}}{\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right)^{2}} - \left( {\theta + 1} \right)}} & (4)\end{matrix}$

We note that it follows from the law of large numbers and the continuousmapping theorem that Z_(n)(θ) is a consistent estimator of Z(θ). Toobtain an estimator of θ₀, we solve the sample-based shape estimatingequation Z_(n)(θ)=0 for a root to be used as the estimator of θ₀. For agiven positive integer value of the shape parameter θ, it might betempting, in light of equations (3) and (4), to think of the proposedestimator in the context of moment-matching (MM). However, one of thefundamental differences between our formulation and moment-matchingmethod is that the exact order of the moments is unknown and thisexponent to which the random variable is raised is precisely what we aretrying to estimate. Since the true unknown exponent can be any positivereal number not just integers, it may be more insightful to view ourproposed estimator in the general framework of approximating the highlynonlinear yet convex transformation (3) by the convex estimatingequation (4). This viewpoint is, in fact, supported by our numericalexperiments showing that the MM method performs poorly for heavy-taileddistributions even for large sample sizes due to the nonrobustness ofthe sample moment (sample mean and variance) to extreme observations. InSection B, we show that the sample shape estimating equation Z_(n)(θ)=0has a unique global root with probability tending to one and theresulting root converges in probability to the true shape parametervalue θ₀. The unique global root of the sample-based shape estimatingequation is illustrated in FIG. 7 which shows plots of the sampleestimator Z_(n)(θ) as a function of θ based on a single random samplegenerated from the GGD distribution with scale parameter σ₀=1 and withthe same values of θ₀ as given in FIG. 6. In particular, FIG. 7illustrates exemplary plots of Z_(n)(θ) as a function of θ based on arandom sample of size n=500 generated from the GGD with a scaleparameter σ₀=1 and the true shape parameter θ₀ corresponding to fourdifferent values: (a) θ₀=2, (b) θ₀=1.5, (c) θ₀=1, and (d) θ₀=0.5.

We also prove that, with probability tending to one, the Newton-Raphsonroot-finding iteration converges to the unique global root of the sampleshape estimating equation from any starting point on the domain Θ_(min).We conclude this section by noting that plugging the shape estimator ofθ₀ into the formula

${\sigma_{0} = \left( {\frac{\theta_{0}}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta_{0}}}} \right)^{\frac{1}{\theta_{0}}}},$the resulting estimator is shown to be consistent for estimating σ₀.

B. Main Results

We begin this section with some notations. Let Θ denote the positivereal line: Θ=(0,∞). Let the maps Y, U, and V be defined as

-   -   Y: θ        E₀|X|^(θ)    -   U: θ        E₀|X|^(θ)log|X|    -   V: θ        E₀|X|^(θ)log²|X|

Similarly we define the sample analogue of these maps as follows:

$\left. {{\overset{\_}{Y}}_{n}\text{:}\mspace{20mu}\theta}\mapsto{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right.$$\left. {{\overset{\_}{U}}_{n}\text{:}\mspace{14mu}\theta}\mapsto{\frac{1}{n}{\sum\limits_{i = 1}^{n}{{X_{i}}^{\theta}\log{X_{i}}}}} \right.$$\left. {{\overset{\_}{V}}_{n}\text{:}\mspace{14mu}\theta}\mapsto{\frac{1}{n}{\sum\limits_{i = 1}^{n}{{X_{i}}^{\theta}\log^{2}{X_{i}}}}} \right.$

Finally let the function Ω be defined as

$\begin{matrix}{{\Omega(\theta)} = \frac{{\Gamma\left( \frac{{2\theta} + 1}{\theta_{0}} \right)}{\Gamma\left( \frac{1}{\theta_{0}} \right)}}{\Gamma^{2}\left( \frac{\theta + 1}{\theta_{0}} \right)}} & (5)\end{matrix}$

With these notations, we can rewrite (3) as

$\begin{matrix}{{Z(\theta)} = {\frac{Y\left( {2\theta} \right)}{Y^{2}(\theta)} - \left( {\theta + 1} \right)}} & (6)\end{matrix}$

Z_(n)(θ) in (4) can be rewritten as

$\begin{matrix}{{Z_{n}(\theta)} = {\frac{{\overset{\_}{Y}}_{n}\left( {2\theta} \right)}{{\overset{\_}{Y}}_{n}^{2}(\theta)} - \left( {\theta + 1} \right)}} & (7)\end{matrix}$

i. Existence and Uniqueness of a Global Root

We are now ready to state the first theorem.

Theorem 1: Let Z(′) be defined as in (3). Then the equation Z(θ)=0 for θε Θ has a unique global root in Θ at the true parameter value θ₀.

Proof: It follows from direct calculations that

$\begin{matrix}{{Y(\theta)} = {{\frac{\sigma_{0}^{\theta}{\Gamma\left( \frac{\theta + 1}{\theta_{0}} \right)}}{\Gamma\left( \frac{1}{\theta_{0}} \right)}\mspace{14mu}{and}\mspace{14mu}{Y\left( {2\theta} \right)}} = {\frac{\sigma_{0}^{2\theta}{\Gamma\left( \frac{{3\theta} + 1}{\theta_{0}} \right)}}{\Gamma\left( \frac{1}{\theta_{0}} \right)}.}}} & \;\end{matrix}$Therefore by (5) and (6), we have Z(θ)=Ω(θ)−(θ+1).

We observe that Z(θ₀)=0. Hence θ₀ is a root of Z(θ)=0. We nextdemonstrate that θ₀ is the only global root of Z(θ)=0. To this end, weshall show that Z(θ) is a strictly convex function of θ in Θ. Let ψ₀ andψ₁ denote the digamma and trigamma functions, respectively. Let thefunction Δ be defined as

$\begin{matrix}{{\Delta\;(\theta)} = {{\psi_{0}\left( \frac{{2\theta} + 1}{\theta_{0}} \right)} - {\psi_{0}\left( \frac{\theta + 1}{\theta_{0}} \right)}}} & (8)\end{matrix}$

Let {dot over (Δ)}, {dot over (Ω)}, and Ż denote their respectivederivatives of Δ, Ω and Z with respect to θ. Then some extensivecomputations show that

$\begin{matrix}{{{\overset{.}{\Omega}(\theta)} = \frac{2{\Omega(\theta)}{\Delta(\theta)}}{\theta_{0}}}{{\overset{.}{Z}(\theta)} = {\frac{2{\Omega(\theta)}{\Delta(\theta)}}{\theta_{0}} - 1}}{and}} & (9) \\\begin{matrix}{{\overset{..}{Z}(\theta)} = {\frac{2}{\theta_{0}}\left\lbrack {{{\overset{.}{\Omega}(\theta)}{\Delta(\theta)}} + {{\Omega(\theta)}{\overset{.}{\Delta}(\theta)}}} \right\rbrack}} \\{{= {\frac{2{\Omega(\theta)}}{\theta_{0}^{2}}\left\lbrack {{2{\Delta^{2}(\theta)}} + {\theta_{0}{\overset{.}{\Delta}(\theta)}}} \right\rbrack}},}\end{matrix} & (10)\end{matrix}$where {umlaut over (Z)} denotes the second derivative of Z with respectto θ. We note that ψ₀ and ψ₁ have the following series expansions:

${{\psi_{0}(x)} = {{- \gamma} + {\sum\limits_{k = 1}^{\infty}\frac{\left( {x - 1} \right)}{k\left( {k + x - 1} \right)}}}},$where γ is the Euler constant and

${\psi_{1}(x)} = {\sum\limits_{k = 1}^{\infty}{\frac{1}{\left( {k + x - 1} \right)^{2}}.}}$

It follows from these expansions of digamma and trigamma functions that(8) can be rewritten as

${{\Delta(\theta)} = {\frac{\theta}{\theta_{0}}{\sum\limits_{k = 1}^{\infty}\frac{1}{{a_{k}(\theta)}{a_{k}\left( {2\theta} \right)}}}}},{where}$${a_{k}(\theta)} = {k + \frac{\theta + 1}{\theta_{0}} - 1.}$ With${{a_{k}(0)} = {K + \frac{1}{\theta_{0}} - 1}},{{we}\mspace{14mu}{obtain}}$$\begin{matrix}{{\theta_{0}{\overset{.}{\Delta}(\theta)}} = {{2{\psi_{1}\left( \frac{{2\theta} + 1}{\theta_{0}} \right)}} - {\psi_{1}\left( \frac{\theta + 1}{\theta_{0}} \right)}}} \\{= {\sum\limits_{k = 1}^{\infty}{\frac{{a_{k}^{2}(0)} - {2\left( \frac{\theta}{\theta_{0}} \right)^{2}}}{{a_{k}^{2}(\theta)}{a_{k}^{2}\left( {2\theta} \right)}}.}}}\end{matrix}$ Hence${{2{\Delta^{2}(\theta)}} + {\theta_{0}{\overset{.}{\Delta}(\theta)}}} = {{\sum\limits_{k = 1}^{\infty}\frac{a_{k}^{2}(0)}{{a_{k}^{2}(\theta)} \cdot {a_{k}^{2}\left( {2\theta} \right)}}} + {2\underset{k \neq l}{\sum\sum}\frac{\theta^{2}}{\theta_{0}^{2}{a_{k}(\theta)}{a_{k}\left( {2\theta} \right)}{a_{l}(\theta)}{a_{l}\left( {2\theta} \right)}}}}$

Since for any k≧1, α_(k)(θ)>0 for all θ ε Θ and Ω(θ)>0 for all θ ε Θ, weconclude from (10) that {umlaut over (Z)}(θ)>0 for all θ ε Θ, that is,Z(θ) is a strictly convex function of θ. The strict convexity of Z(θ)implies that Z(θ)=0 has at most two distinct roots in Θ. We now claimthat the question Z(θ)=0 has a unique root at θ=θ₀. Indeed, there are anumber of ways to prove this claim. One approach is to assume that thereis another root {tilde over (θ)}₀ ε Θ such that {tilde over (θ)}₀≠θ₀. If{tilde over (θ)}₀>θ₀, then

$0 = {{{Z\left( {\overset{\sim}{\theta}}_{0} \right)} - {Z\left( \theta_{0} \right)}} = {{\left( {{\overset{\sim}{\theta}}_{0} - \theta_{0}} \right){\overset{.}{Z}\left( \theta_{0} \right)}} + {\frac{1}{2}\left( {{\overset{\sim}{\theta}}_{0} - \theta_{0}} \right)^{2}{\overset{..}{Z}\left( \theta_{0}^{*} \right)}}}}$for some θ₀ between θ₀ and {tilde over (θ)}₀. It follows from directcomputations that Δ(θ₀)=θ₀/(θ₀+1) and Ω(θ₀)=θ₀+1. Thus, by (9), we haveŻ(θ₀)=1. This together with the fact that {umlaut over (Z)}(θ₀*)>0 leadsto a contradiction. If 0<{tilde over (θ)}₀<θ₀, since Z is continuous on[{tilde over (θ)}₀, θ₀] and is differentiable at each point of ({tildeover (θ)}₀, θ₀), the mean value theorem implies that there exists a realnumber d ε({tilde over (θ)}₀, θ₀) such that

${\overset{.}{Z}(d)} = {\frac{{Z\left( \theta_{0} \right)} - {Z\left( {\overset{\sim}{\theta}}_{0} \right)}}{\theta_{0} - {\overset{\sim}{\theta}}_{0}} = 0.}$By the same reasoning, we note that Z is differentiable at each point of(0, {tilde over (θ)}₀) and

${{\lim\limits_{\theta\rightarrow 0^{+}}{Z(\theta)}} = 0},$there exists a real number c ε(0, {tilde over (θ)}₀) such that Ż(c)=0.Thus, we obtain that c<d, but Ż(c)=Ż(d)=0. This is a contradiction tothe observation that Ż is a strictly increasing function of θ in view ofthe fact that {umlaut over (Z)}(θ)>0 for all θ ε Θ.

ii. Global Convergence of Newton-Raphson Algorithm for Shape Equation

Theorem 1 provides the theoretical guarantee that the unique global rootof the equation Z(θ)=0 is equal to the true parameter value θ₀. Anatural question would be how to find the root of the equation Z(θ)=0.There are many root-finding procedures that can be used for thispurpose. Here we focus on the illustrative Newton-Raphson method due toits quadratic convergence property. Similar theorems can be proved forother root-finding procedures such as secant method. The Newton-Raphsonalgorithm for finding the unique global root of the equation Z(θ)=0 isgiven as follows:

$\begin{matrix}{\theta_{k + 1} = {\theta_{k} - \frac{Z\left( \theta_{k} \right)}{\overset{.}{Z}\left( \theta_{k} \right)}}} & (11)\end{matrix}$for k=1, 2, . . . , where θ₁ is the starting value for the iteration.Our next theorem says that from any starting value θ₁ in a semi-infiniteinterval Θ_(min), the Newton-Raphson algorithm (11) converges. Thisglobal convergence is in contrast with the generally poor globalconvergence properties of Newton-Raphson method.

Theorem 2: The Newton-Raphson functional iteration algorithm (11)converges to the true parameter value θ₀ from any starting point θ_(l) εΘ_(min), where the semi-infinite interval Θ_(min) is defined asΘ_(min)=(θ_(min), ∞) and

$\theta_{\min} = {\underset{\theta \in \Theta}{\arg\;\min}\;{{Z(\theta)}.}}$

In other words, θ_(k)→θ₀ as k→∞.

Proof. First we establish the existence and uniqueness of the minimizer

$\theta_{\min} = {\underset{\theta \in \Theta}{\arg\;\min}\;{Z(\theta)}}$so that argmin notation is well defined. It follows from the strictconvexity of Z that the minimization problem of minimizing Z(θ) subjectto θ ε Θ has at most one solution (if any). From previous computationsgiven in the proof of Theorem 1, we have Ż(θ₀)=1.

Evaluating the right hand limit of the gradient function Ż(θ₀) as θtends to zero, we obtain that

${\lim\limits_{\theta\rightarrow 0^{+}}{\overset{.}{Z}(\theta)}} = {- 1.}$Thus by the continuity of Ż and the intermediate value theorem, thereexists a real number θ_(min) ε (0, θ₀) such that Ż(θ_(min))=0. Since Żis strictly increasing on Θ, we obtain Ż(θ)>0 for θ>θ_(min) and Ż(θ)<0for θ<θ_(min). Thus the convex minimization problem of minimizing Z(θ)subject to θ ε Θ has a unique solution at the minimizer θ_(min). To showthe convergence of the iterates

$\theta_{k + 1} = {\theta_{k} - \frac{Z\left( \theta_{k} \right)}{\overset{.}{Z}\left( \theta_{k} \right)}}$to the true value θ₀ from any starting point θ₁ ε Θ_(min), we considertwo situations θ₁ ε (θ_(min), θ₀) and θ₁ ε [θ₀, ∞) separately. If θ₁ ε(θ_(min), θ₀), we demonstrate that only one step Newton-Raphsoniteration starting from θ₁ would move the iterate θ₂ into the interval[θ₀, ∞). To see this, we observe that Z(θ₁)<0 and Ż(θ)>0. It followsfrom the strict convexity and differentiability of Z that

${{\overset{.}{Z}\left( \theta_{1} \right)} < \frac{{Z\left( \theta_{0} \right)} - {Z\left( \theta_{1} \right)}}{\theta_{0} - \theta_{1}}} = \frac{- {Z\left( \theta_{1} \right)}}{\theta_{0} - \theta_{1}}$

This is equivalent to

${{\theta_{1} - \frac{Z\left( \theta_{1} \right)}{\overset{.}{Z}\left( \theta_{1} \right)}} > \theta_{0}},$that is, θ₂>θ₀. Thus it is enough for us to prove that the functionaliteration

$\theta_{k + 1} = {\theta_{k} - \frac{Z\left( \theta_{k} \right)}{\overset{.}{Z}\left( \theta_{k} \right)}}$converges to θ₀ from any starting point θ₁ ε [θ₀, ∞). To this end, letthe function g be defined by

${g(\theta)} = {\theta - \frac{Z(\theta)}{\overset{.}{Z}(\theta)}}$and the interval [θ₀, ∞) be denoted by I. We shall demonstrate that theequation θ=g(θ) has a unique fixed point θ₀ on the interval I and thefunctional iterates θ_(k+1)=g(θ_(k)) converge to θ₀ regardless of theirstarting point θ₁ ε I. It is clear that the function g is continuous onI and that g(θ₀)=θ₀ in view of the equation Z(θ₀)=0 so that θ₀ is afixed point of g. Furthermore, θ₀ is the only fixed point of g on I. Ifthis were not true, we have another fixed point θ₀*ε I with θ₀*≠θ₀ andθ₀*=g(θ₀*), this would be in contradiction with the fact that

${g\left( \theta_{0}^{*} \right)} = {{\theta_{0}^{*} - \frac{Z\left( \theta_{0}^{*} \right)}{\overset{.}{Z}\left( \theta_{0}^{*} \right)}} < {\theta_{0}^{*}.}}$What remains to be shown is that the functional iterates converge to θ₀.It follows from direct computations that

${\overset{.}{g}(\theta)} = {\frac{{Z(\theta)}{\overset{..}{Z}(\theta)}}{\left( {\overset{.}{Z}(\theta)} \right)^{2}}.}$Since {umlaut over (Z)}(θ)>0 and Z(θ)≧0 for all θ ε I, we conclude thatġ(θ)≧0 for all θ ε I (in fact, ġ(θ)>0 for all θ ε I\{θ₀}). Hence g(θ) isan increasing function of θ on I, which implies that θ₀=g(θ₀)≦g(θ)<∞ forall θ ε I. That is, the function g maps the interval I into itself. Inview of the fact that Ż(θ)>0 and, again, Z(θ)≧0 for all θ ε I, we have

$\theta_{k + 1} = {{g\left( \theta_{k} \right)} = {{\theta_{k} - \frac{Z\left( \theta_{k} \right)}{\overset{.}{Z}\left( \theta_{k} \right)}} \leq \theta_{k}}}$for θ_(k) ε I. This fact together with the observation thatθ_(k+1)=g(θ_(k))≧g(θ₀)=θ_(k for θ) _(k) ε I implies that the sequence{θ_(k)}_(k=1) ^(∞) is a monotonically decreasing sequence bounded below.Hence the limit lim_(k→∞)θ_(k) exists in I and this limit is denoted byθ_(∞). Invoking the continuity of the function g in the definingrelation θ_(k+1)=g(θ_(k)) shows that θ_(∞) is a fixed point of thefunction g in I. By the uniqueness of the fixed point of g on I, weconclude that θ_(∞)=θ₀, that is, lim_(k→∞)θ_(k)=θ₀, which completes theproof.

Some numerical examples are provided in FIG. 8 to demonstrate the globalconvergence of Newton-Raphson algorithm. From FIG. 8, we see thatstarting from any point in the semi-infinite interval Θ_(min), no matterhow far away the starting value θ₁ is from the true parameter θ₀,Newton-Raphson iterations always converge to θ₀. More specifically, inFIG. 8, global convergence of the Newton-Raphson algorithm isdemonstrated by plots of the trajectory of the functional iterationswith different starting values. The solid lines represent the shapeestimates θ_(k) plotted as functions of the iteration number k. Thedotted line denotes the true shape parameter θ₀ corresponding to fourdifferent values: (1) θ₀=2, (2) θ₀=1.5, (3) θ₀=1, and (d) θ₀=0.5.

iii. Existence of a Unique Global Root for the Sample-Based ShapeEstimating Equation

Both Theorem 1 and Theorem 2 are proved with preklowledge of θ₀. Inpractice they are not operational since θ₀ is unknown. In spite of this,they do provide valuable insights into obtaining analogous results forthe sample-based shape estimating equation. With these new insights, wecan now prove that with high probability, the sample estimating equationZ_(n)(θ₀)=0 has a unique global root on any arbitrarily large intervalΘ_(L)=(0, θ_(l)), where θ_(L)>θ₀ is any arbitrarily large real number.

Theorem 3: Let Z_(n) be defined as in (4). Then there exists a uniqueglobal root in Θ_(L) that solves the estimating equation Z_(n)(θ)=0 withprobability tending to 1. Furthermore, let this unique root be denotedby {circumflex over (θ)}_(n). Then {circumflex over (θ)}_(n)→θ₀ inprobability, as n→∞. In other words, {circumflex over (θ)}_(n) isconsistent for estimating θ₀.

Proof: Let G_(n)(θ)=n Y _(n)(2θ)−n(θ+1) Y _(n) ²(θ). We note from (7)that Z_(n)(θ) has the same roots (if any) as G_(n)(θ). DifferentiatingG_(n)(θ) with respect to θ and using the fact that

${\frac{\mathbb{d}{{\overset{\_}{Y}}_{n}(\theta)}}{\mathbb{d}\theta} = {{\overset{\_}{U}}_{n}(\theta)}},$we obtain

$\begin{matrix}{{{\overset{.}{G}}_{n}(\theta)} = {{2n\;{{\overset{\_}{U}}_{n}\left( {2\theta} \right)}} - {n\;{{\overset{\_}{Y}}_{n}^{2}(\theta)}} - {2{n\left( {\theta + 1} \right)}{{\overset{\_}{Y}}_{n}(\theta)}{{\overset{\_}{U}}_{n}(\theta)}}}} & (12)\end{matrix}$

It follows from the law of large numbers and the continuous mappingtheorem every θ ε Θ,

${{\overset{\_}{Y}}_{n}(\theta)}\overset{P_{0}}{\rightarrow}{Y(\theta)}$${{\overset{\_}{U}}_{n}(\theta)}\overset{P_{0}}{\rightarrow}{U(\theta)}$

Thus by (12), we have

$\begin{matrix}{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}}\overset{P_{0}}{\rightarrow}{\overset{.}{G}(\theta)}} & (13)\end{matrix}$

whereĠ(θ)=2U(2θ)−Y ²(θ)−2(θ+1)Y(θ)U(θ).

In particular, (13) implies that

${\frac{1}{n}{{\overset{.}{G}}_{n}\left( \theta_{0} \right)}}\overset{P_{0}}{\rightarrow}{\overset{.}{G}\left( \theta_{0} \right)}$

It follows from some computations that

$\begin{matrix}{{{U(\theta)} = {{Y(\theta)}\left\{ {{\log\;\sigma_{0}} + {\frac{1}{\theta_{0}}{\psi_{0}\left( \frac{\theta + 1}{\theta_{0}} \right)}}} \right\}}}{{\overset{.}{G}(\theta)} = {{Y^{2}(\theta)}\left\{ {\frac{2{\Omega(\theta)}{U\left( {2\theta} \right)}}{Y\left( {2\theta} \right)} - \frac{2\left( {\theta + 1} \right){U(\theta)}}{Y(\theta)} - 1} \right\}}}} & (14)\end{matrix}$

These together with expressions of Y(θ) and Ω(θ) imply thatG(θ₀)=Y²(θ₀)=σ₀ ^(2θ) ⁰ /θ₀ ², which is positive. For any real numberr>0, define the shrinking neighborhood of θ₀ by

$\begin{matrix}{{H_{n}\left( {\theta_{0},r} \right)} = \left\{ {\theta \in {{\Theta\text{:}\mspace{14mu}{{\theta - \theta_{0}}}} \leq \frac{r}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}}} \right\}} & (15)\end{matrix}$

Then

$\begin{matrix}{{\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}} - {\overset{.}{G}\left( \theta_{0} \right)}}}} \leq {{\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}} - {\overset{.}{G}(\theta)}}}} + {\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\overset{.}{G}(\theta)} - {\overset{.}{G}\left( \theta_{0} \right)}}}}}} & (16)\end{matrix}$We next show the uniform convergence of 1/nĠ_(n)(θ) to G(θ) inprobability uniformly for θ ranging over compacta in Θ. Directdifferentiation of Y _(n) ²(θ) with respect to θ leads to

$\frac{\mathbb{d}^{2}{Y_{n}^{2}(\theta)}}{\mathbb{d}\theta^{2}} = {{2\left\{ {{{\overset{\_}{U}}_{n}^{2}(\theta)} + {{{\overset{\_}{Y}}_{n}(\theta)}{{\overset{\_}{V}}_{n}(\theta)}}} \right\}} > 0}$for all θ ε Θ and all n, where we have used the relation that

$\frac{\mathbb{d}{{\overset{\_}{U}}_{n}(\theta)}}{\mathbb{d}\theta} = {{{\overset{\_}{V}}_{n}(\theta)}.}$Therefore Y _(n) ²(θ) is a sequence of strictly convex random functions.For every θ ε Θ, it follows from the law of large numbers and thecontinuous mapping theorem that

Thus we conclude by the convexity lemma that

for each compact subset K of Θ. By the same reasoning,

in probability.

Define W _(n)(θ) by

$\begin{matrix}{{{\overset{\_}{W}}_{n}(\theta)} = {{{\overset{\_}{U}}_{n}(\theta)} - {\frac{\theta^{2}}{2n}{\sum\limits_{i = 1}^{n}{\log^{3}{X_{i}}}}}}} & (17)\end{matrix}$

Differentiating W _(n)(θ) twice with respect to θ yields the followingexpression:

$\frac{\mathbb{d}^{2}{{\overset{\_}{W}}_{n}^{2}(\theta)}}{\mathbb{d}\theta^{2}} = {\frac{\mathbb{d}{{\overset{\_}{V}}_{n}(\theta)}}{\mathbb{d}\theta} - {\frac{1}{n}{\sum\limits_{i = 1}^{n}{\log^{3}{X_{i}}}}}}$Since

$\frac{\mathbb{d}^{2}{{\overset{\_}{V}}_{n}(\theta)}}{\mathbb{d}\theta^{2}} > 0$for all θ ε Θ and all n, it follows that

$\frac{\mathbb{d}{{\overset{\_}{V}}_{n}(\theta)}}{\mathbb{d}\theta} = {\frac{1}{n}{\sum\limits_{i = 1}^{n}{{X_{i}}^{\theta}\log^{3}{X_{i}}}}}$is a strictly increasing function of θ. Therefore

$\frac{\mathbb{d}{{\overset{\_}{V}}_{n}(\theta)}}{\mathbb{d}\theta} > {\frac{1}{n}{\sum\limits_{i = 1}^{n}{\log^{3}{X_{i}}}}}$log³|X_(i)| which implies that

$\frac{\mathbb{d}^{2}{{\overset{\_}{W}}_{n}^{2}(\theta)}}{\mathbb{d}\theta^{2}} > 0.$We conclude that W _(n)(θ) is a strictly convex function of θ. By thesame argument as before, for every θ ε Θ, W _(n)(θ)→W(θ) in probability,where W(Θ) is given as

$\begin{matrix}{{W(\theta)} = {{U(\theta)} - {\frac{\theta^{2}}{2}E_{0}\log^{3}{X}}}} & (18)\end{matrix}$

Hence

$\sup\limits_{{\theta\varepsilon}\; K}{{{{{\overset{\_}{W}}_{n}(\theta)} - {W(\theta)}}}\overset{P_{o}}{\longrightarrow}0}$In addition, we note that Y and U are continuous functions on thecompact set K and thus are bounded on K. With these facts, we concludethat

$\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}$uniformly converges to Ġ(θ) in probability on any compact subset K of Θ.Therefore,

$\begin{matrix}{{{\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}} - {\overset{.}{G}(\theta)}}}} \leq {\sup\limits_{\theta \in K}{{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}} - {\overset{.}{G}(\theta)}}}}}\overset{P_{0}}{\rightarrow}0} & (19)\end{matrix}$as n→∞, where the compact set K can be taken to be, for example, as

$K = {\left\lbrack {{\frac{1}{2}\theta_{0}},{2\theta_{0}}} \right\rbrack.}$The uniform continuity of Ġ on the compact set implies that

$\begin{matrix}\left. {\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\overset{.}{G}(\theta)} - {\overset{.}{G}\left( \theta_{0} \right)}}}}\rightarrow 0 \right. & (20)\end{matrix}$as n→∞. Combining (19) and (20) together with (16), we conclude that

$\begin{matrix}{{\sup\limits_{\theta \in {H_{n}{({\theta_{0},r})}}}{{{\frac{1}{n}{{\overset{.}{G}}_{n}(\theta)}} - {\overset{.}{G}\left( \theta_{0} \right)}}}}\overset{P_{0}}{\rightarrow}0} & (21)\end{matrix}$

Next we show that the sequence

$\frac{1}{\sqrt{n}}{G_{n}\left( \theta_{0} \right)}$is uniformly tight. It follows from extensive computations that

$\begin{matrix}{{\frac{1}{\sqrt{n}}{G_{n}\left( \theta_{0} \right)}} = {{\sqrt{n}\left\lbrack {{{\overset{\_}{Y}}_{n}\left( {2\theta_{0}} \right)} - {Y\left( {2\theta_{0}} \right)}} \right\rbrack} - {\sqrt{n}{\left( {\theta_{0} + 1} \right)\left\lbrack {{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)} - {Y\left( \theta_{0} \right)}} \right\rbrack}^{2}} - {2\sqrt{n}\left( {\theta_{0} + 1} \right){{Y\left( \theta_{0} \right)}\left\lbrack {{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)} - {Y\left( \theta_{0} \right)}} \right\rbrack}} + {\sqrt{n}\left\lbrack {{Y\left( {2\theta_{0}} \right)} - {\left( {\theta_{0} + 1} \right){Y^{2}\left( \theta_{0} \right)}}} \right\rbrack}}} & (22)\end{matrix}$

We observe that Z(θ₀)=0. This implies that Y(2θ₀)−(θ₀+1)Y²(θ₀)=0, sothat the last term of (22) is equal to zero. By the central limittheorem, both √{square root over (n)}[ Y _(n)(θ₀)−Y(θ₀)] and √{squareroot over (n)}[ Y _(n)(2θ₀)−Y(2θ₀)] converge in distribution to thenormal distributions with mean zero and finite variances Var₀|X|^(θ) ⁰and Var₀|X|^(2θ) ⁰ , respectively. Thus, both the first and the thirdterms of (22) O_(p)(1). By the same reasoning,

$\begin{matrix}{{\sqrt{n}\left\lbrack {{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)} - {Y\left( \theta_{0} \right)}} \right\rbrack}^{2} = {\frac{1}{\sqrt{n}}\left\lbrack {\sqrt{n}\left( {{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)} - {Y\left( \theta_{0} \right)}} \right)} \right\rbrack}^{2}} \\{= {{o(1)}{O_{p}(1)}}} \\{= {o_{p}(1)}}\end{matrix}$

This finishes the proof that

${\frac{1}{\sqrt{n}}{G_{n}\left( \theta_{0} \right)}} = {{O_{p}(1)}.}$Thus for each ε>0, there exist a positive real number M and an integerN₁ such that P₀(E_(n))>1−ε/2 for all n>N₁, where

$E_{n} = \left\{ {{{\frac{1}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}{G_{n}\left( \theta_{0} \right)}}} \leq M} \right\}$

For any given δ>0, choose the positive real number r large enough suchthatr ² −Mr−δ>0  (23)

Consider the two end points of the shrinking neighborhood H_(n)(θ₀, r)as defined in (15):

$\theta = {\theta_{0} \pm {\frac{r}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}.}}$It follows from the Taylor theorem thatG _(n)(θ)=G _(n)(θ₀)+(θ−θ₀)Ġ _(n)(θ_(n)*)where θ_(n)* is between θ and θ₀. Thus

$\begin{matrix}{{{\left( {\theta - \theta_{0}} \right){G_{n}(\theta)}} = {{{r \cdot {sign}}\mspace{11mu}\left( {\theta - \theta_{0}} \right)\frac{1}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}{G_{n}\left( \theta_{0} \right)}} + r^{2} + {R_{n}\left( \theta_{0} \right)}}}{where}{{R_{n}\left( \theta_{0} \right)} = {\frac{r^{2}}{\overset{.}{G}\left( \theta_{0} \right)}\left\lbrack {{\frac{1}{n}{{\overset{.}{G}}_{n}\left( \theta_{n}^{*} \right)}} - {\overset{.}{G}\left( \theta_{0} \right)}} \right\rbrack}}} & (24)\end{matrix}$

It follows from (21) that

${R_{n}\left( \theta_{0} \right)}\overset{p_{0}}{\rightarrow}0.$Thus for the given ε and δ, we can find a positive integer N₂ such thatfor all n>N₂, we have P₀(F_(n))>1−ε/2, where F_(n)={|R_(n)(θ₀)|≦δ}.Therefore, for all n>N=max{N₁, N₂}, we obtain P₀(E_(n)∩F_(n))>1−ε. Onthe event E_(n)∩F, we have

${{r \cdot {sign}}\mspace{11mu}\left( {\theta - \theta_{0}} \right)\frac{1}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}{G_{n}\left( \theta_{0} \right)}} \geq {- {Mr}}$and R_(n)(θ₀)≧−δ. Thus, by (23) and (24), we have(θ−θ₀)G _(n)(θ)≧r ² −Mr−δ>0

We conclude that

${G_{n}\left( {\theta_{0} + \frac{r}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}} \right)} > {0\mspace{14mu}{and}\mspace{14mu}{G_{n}\left( {\theta_{0} - \frac{r}{\sqrt{n\;{\overset{.}{G}\left( \theta_{0} \right)}}}} \right)}} < 0.$Since G_(n) is a continuous function of θ, there exist at least onesolution {circumflex over (θ)}_(n) ε H_(n)(θ₀, r) such thatG_(n){circumflex over (()}{circumflex over (θ)}_(n))=0. Thus for alln>N, P₀({circumflex over (θ)}_(n) ε H_(n)(θ₀, r))≧P₀(E_(n)∩F_(n))>1−ε.This implies that P₀(G_(n)({circumflex over (θ)}_(n))=0)→1 as n→∞ and

${{\hat{\theta}}_{n}\overset{P_{0}}{\rightarrow}\theta_{0}},$as n→∞.

What remains to be shown is the uniqueness of the root {circumflex over(θ)}_(n). We shall show that on the large interval Θ_(L), withprobability tending to 1, the solution of the estimating equationZ_(n)(θ)=0 is unique for all sufficiently large n. To this end, weextend the values of continuous functions Z_(n), Ż_(n), {umlaut over(Z)}_(n) at θ=0 by defining them to be the corresponding right-handlimits,

${Z_{n}(0)} = {{\lim\limits_{\theta\rightarrow 0^{+}}{Z_{n}(\theta)}} = 0}$${{\overset{.}{Z}}_{n}(0)} = {{\lim\limits_{\theta\rightarrow 0^{+}}{{\overset{.}{Z}}_{n}(\theta)}} = {- 1}}$and $\begin{matrix}{{{\overset{..}{Z}}_{n}(0)} = {\lim\limits_{\theta\rightarrow 0^{+}}{{\overset{..}{Z}}_{n}(\theta)}}} \\{= {\frac{2}{n}{\sum\limits_{i = 1}^{n}\left( {{\log{X_{i}}} - {\frac{1}{n}{\sum\limits_{i = 1}^{n}{\log{X_{i}}}}}} \right)^{2}}}}\end{matrix}$It follows from the same arguments as given in the proof of Lemma 1 that

${\sup\limits_{\theta \in K}\mspace{11mu}{{{{{\overset{¨}{Z}}_{n}(\theta)} - {\overset{¨}{Z}(\theta)}}}\overset{P_{0}}{\longrightarrow}0}},$where K=[0, θ_(L)]. Hence, for each ε>0 and δ>0, there exists a positiveinteger N such that for all n>N, P(|{umlaut over (Z)}_(n)(θ)−{umlautover (Z)}(θ)|≦δ for all θ ε K)>1−ε. On the compact set K, {umlaut over(Z)}(θ) is uniformly continuous and achieves both its minimum andmaximum values. Since {umlaut over (Z)}(θ)>0 for all θ ε Θ and by directcalculation {umlaut over (Z)}(0)>0, there exists a real number θ_ε Ksuch that

${\inf\limits_{\theta \in K}\mspace{11mu}{\overset{¨}{Z}(\theta)}} = {{\overset{¨}{Z}({\theta\_})} > 0.}$

Therefore for 0<δ<{umlaut over (Z)}(θ_) and for all n>N, we haveP({umlaut over (Z)}_(n)(θ)>0 for all θ ε K)>1−ε. This implies thatP(C_(n))>1−ε for all n>N, where C_(n)={Z_(n) is a strictly convexfunction of θ ε K}. On the event C_(n), it will be shown that thesolution of the equation Z_(n)(θ)=0 for θ ε Θ_(L) is unique for all n>N.For any ω ε C_(n) and suppose that {tilde over (θ)}_(n) is anothersolution of the equation Z_(n)(θ)=0 such that {tilde over(θ)}_(n)(ω)≠{circumflex over (θ)}_(n)(ω) for some n>N. Then 0<{tildeover (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω)<{tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω)<θ_(L), where the notation

and

denote minima and maxima, respectively. Thus there exists a 0<λ_(n)(ω)<1such that{tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω)=λ_(n)(ω)({tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω))

It follows from the strict convexity of Z_(n) on K thatZ _(n)({tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω))<λ_(n)(ω)Z _(n)({tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω))  (25)

In view of the equations Z_(n)({tilde over (θ)}_(n)(ω))=0 andZ_(n)({circumflex over (θ)}_(n)(ω))=0, we haveZ _(n)({tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω))=Z _(n)({tilde over (θ)}_(n)(ω)

{circumflex over (θ)}_(n)(ω))=0

This fact contradicts with the equation (25). We conclude that, withprobability tending to 1, the solution of the equation Z_(n)(θ)=0 isunique on the sufficiently large interval Θ_(L) for all large n. Theproof is now complete.

In practical applications, both the scale and shape parameters areusually estimated. Given the consistent estimator {circumflex over(θ)}_(n) provided by Theorem 3, it would be natural to plug theestimator {circumflex over (θ)}_(n) of θ₀ into the formula for{circumflex over (σ)}₀ given in Section A. Let the resulting scaleestimator be denoted by

$\begin{matrix}{{\hat{\sigma}}_{n}:=\left( {\frac{{\hat{\theta}}_{n}}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{{\hat{\theta}}_{n}}}} \right)^{\frac{1}{{\hat{\theta}}_{n}}}} & (26)\end{matrix}$Then the following corollary is a direct consequence of Theorem 3.

Corollary 1: The scale estimator {circumflex over (σ)}_(n) given inequation (26) is a consistent estimator of the scale parameter{circumflex over (σ)}₀.

Proof. With the notation introduced in Section B, we can rewrite{circumflex over (σ)}₀ and σ₀ as {circumflex over (σ)}₀=({circumflexover (θ)}_(n) Y _(n)({circumflex over(θ)}_(n)))^(1/{circumflex over (θ)}) ^(n) and σ₀=(θ₀Y(θ₀))^(1/θ) ⁰ ,respectively. It follows from Taylor's theorem that

$\begin{matrix}{{{\overset{\_}{Y}}_{n}\left( {\hat{\theta}}_{n} \right)} = {{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)} + {\left( {{\hat{\theta}}_{n} - \theta_{0}} \right)\;{{\overset{\_}{U}}_{n}\left( \theta_{0} \right)}} + {\frac{1}{2}\left( {{\hat{\theta}}_{n} - \theta_{0}} \right)^{2}{{\overset{\_}{V}}_{n}\left( {\overset{\sim}{\theta}}_{n} \right)}}}} & (27)\end{matrix}$where {tilde over (θ)}_(n) is between θ₀ and θ_(n). By Theorem 3,{circumflex over (θ)}_(n) is a consistent estimator of θ₀, hence

${{\overset{\sim}{\theta}}_{n}\overset{P_{0}}{\longrightarrow}\theta_{0}}.$This implies that there exists some compact set K containing θ₀ suchthat for sufficiently large n

${{{\overset{\_}{V}}_{n}\left( {\overset{\sim}{\theta}}_{n} \right)}} \leq {{\sup\limits_{\theta \in K}\mspace{11mu}{{{{\overset{\_}{V}}_{n}(\theta)} - {V(\theta)}}}} + {{{V\left( {\overset{\sim}{\theta}}_{n} \right)}}.}}$

Since V _(n) is a sequence of strictly convex functions of θ and

${{\overset{\_}{V}}_{n}(\theta)}\overset{P_{0}}{\longrightarrow}{V(\theta)}$for every θ ε Θ in view of the law of large numbers. we conclude againby the convexity lemma that sup

$\sup_{\theta \in \Theta}{{{{{\overset{\_}{V}}_{n}(\theta)} - {V(\theta)}}}\overset{P_{0}}{\longrightarrow}0.}$This fact coupled with the continuous mapping theorem shows that| V _(n)({circumflex over (θ)}_(n))|≦o _(p)(1)+O _(p)(1)=O _(p)(1)

By the law of large numbers, we have

${{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)}\overset{P_{0}}{\longrightarrow}{Y\left( \theta_{0} \right)}$and

${{{\overset{\_}{U}}_{n}\left( \theta_{0} \right)}\overset{P_{0}}{\longrightarrow}{U\left( \theta_{0} \right)}}.$From these facts and the equation (27), we conclude that

${{{\overset{\_}{Y}}_{n}\left( \theta_{0} \right)}\overset{P_{0}}{\longrightarrow}{Y\left( \theta_{0} \right)}}.$An application of the continuous mapping theorem completes the proof.

iv. Global Convergence of Newton-Raphson Algorithm for the Sample-BasedShape Estimating Equation

To find the unique global root {circumflex over (θ)}_(n) of the equationZ_(n)(θ)=0, we consider the Newton-Raphson functional iterationalgorithm,

$\begin{matrix}{{\hat{\theta}}_{n,{k + 1}} = {{\hat{\theta}}_{n,k} - \frac{Z_{n}\left( {\hat{\theta}}_{n,k} \right)}{{\overset{.}{Z}}_{n}\left( {\hat{\theta}}_{n,k} \right)}}} & (28)\end{matrix}$for k=1, 2, . . . , where {circumflex over (θ)}_(n,1) is the startingvalue for the iteration.

Theorem 4: With probability tending to one, the Newton-Raphsonfunctional iteration {{circumflex over (θ)}_(n,k)}_(k=1) ^(∞) defined bythe algorithm (28) converges to the unique global root {circumflex over(θ)}_(n) from any starting point {circumflex over (θ)}_(n,l) ε Θ_(min).In other words,

$\left. {P_{0}\left( {{\lim\limits_{k\rightarrow\infty}{\hat{\theta}}_{n,k}} = {\hat{\theta}}_{n}} \right)}\rightarrow\left. {1\mspace{14mu}{as}\mspace{14mu} n}\rightarrow{\infty.} \right. \right.$

Proof. It follows from Lemma 1 that {umlaut over (Z)}_(n)(θ) convergesto Z_(n)(θ) in probability uniformly for all θ in any compact set K ⊂ Θ.For any real number ε>0, let A_(n) be the event that |{umlaut over(Z)}_(n)(θ)−{umlaut over (Z)}(θ)|<ε for all θ ε K. Then for any realnumber δ>0, there exists a positive integer N₁ such that for all n>N₁,we have P₀(A_(n))>1−δ. Since {umlaut over (Z)} is a continuous functionof θ and {umlaut over (Z)}(θ)>0 for all θ ε Θ, there exists a realnumber θ_ε K such that

${{\overset{¨}{Z}(\theta)} \geq {\inf\limits_{\theta \in K}{\overset{¨}{Z}(\theta)}}} = {{{\overset{¨}{Z}({\theta\_})} > {0\mspace{14mu}{for}\mspace{14mu}{all}\mspace{14mu}\theta}} \in {K.}}$Hence for any 0<ε<{umlaut over (Z)}(θ_), the event A_(n) implies theevent B_(n)={{umlaut over (Z)}_(n)(θ)>0 for all θ ε K}. Let C_(n) be theevent that Z_(n) is a strictly convex function of θ and Ż_(n) is astrictly increasing function of θ on K. Then the event B_(n) implies theevent C_(n) so that P₀(C_(n))≧P₀(B_(n))>1−δ for all n>N₁. By Theorem

${{\hat{\theta}}_{n}\overset{P_{0}}{\longrightarrow}\theta_{0}},$hence there exists a positive integer N₂ such that for all n>N₂,P₀(D_(n))>1−δ, where the event D_(n)={|{circumflex over (θ)}_(n)−θ₀|<ε}.For any real number L such that θ_(min) <L<θ₀ and for 0<ε<θ₀−L. Theevent D_(n) implies the event E_(n)={L<{circumflex over (θ)}_(n)<2θ₀−L}.Thus for all n>max{N₁, N₂} and 0<ε<min{θ₀−L,{umlaut over (Z)}(θ_)}, weobtain P₀(C_(n)∩E_(n))≧1−2δ. To show the convergence of theNewton-Raphson algorithm from any starting point {circumflex over (θ)}2_(n,1) ε Θ_(min), we will establish first that for any given startingvalue {circumflex over (θ)}_(n,1)=θ₁, if θ₁ ε(θ_(min), {circumflex over(θ)}_(n)), then only one step Newton-Raphson iteration would, withprobability tending to one, move the iterate {circumflex over (θ)}_(n,2)into the interval [{circumflex over (θ)}_(n), ∞). Let M be any arbitrarylarge real number greater than max

$\left\{ {{2\theta_{0}},{\theta_{1} + {2\frac{{Z\left( \theta_{1} \right)}}{\overset{.}{Z}\left( \theta_{1} \right)}}}} \right\}$and choose the real number L such that θ_(min)<L<min(θ₀, θ₁), and takethe compact subset K as K=[L, M]. It follows again from Lemma 1 thatŻ_(n)(θ) converges uniformly to Ż(θ) in probability on the compact setK. Hence there exists a positive integer N₃ such that for all n>N₃, wehave P₀(|Ż_(n)(θ)−Ż(θ)|<ε for all θ ε K)>1−δ. Since Ż is a strictlyincreasing function of θ, we conclude that Ż(L)>0=Ż(θ_(min)). Thus, forall 0<ε<Ż(L) and all n>N₃, we have P₀(F_(n))>1−δ, where the eventF_(n)={Ż_(n)(θ)>0 for all θ ε K}. Let the event G_(n) be defined asG_(n)={Z_(n)({circumflex over (θ)}_(n))=0}. Theorem 3 implies that thereexists a positive integer N₄ such that for all n>N₄, we haveP₀(G_(n))>1−δ. Combining this fact with previous results for the eventsC_(n), E_(n), and F_(n), we conclude thatP₀(C_(n)∩E_(n)∩F_(n)∩G_(n))>1−4δ for all

$n > {\max\limits_{1 \leq i \leq 4}\left\{ N_{i} \right\}}$and 0<ε<min{θ₀−L, {umlaut over (Z)}(θ_), Ż(L)}. On the eventC_(n)∩E_(n)∩F_(n)∩G_(n), it follows from the strict convexity of Z_(n)and the monotonicity of Ż_(n) that

${{{\overset{.}{Z}}_{n}\left( {\hat{\theta}}_{n,1} \right)} < \frac{{Z_{n}\left( {\hat{\theta}}_{n} \right)} - {Z_{n}\left( {\hat{\theta}}_{n,1} \right)}}{{\hat{\theta}}_{n} - {\hat{\theta}}_{n,1}}} = \frac{- {Z_{n}\left( {\hat{\theta}}_{n,1} \right)}}{{\hat{\theta}}_{n} - {\hat{\theta}}_{n,1}}$and Ż_(n)({circumflex over (θ)}_(n,1))>0. These are equivalent to

${{{\hat{\theta}}_{n,1} - \frac{Z_{n}\left( {\hat{\theta}}_{n,1} \right)}{{\overset{.}{Z}}_{n}\left( \theta_{n,1} \right)}} > \theta_{n}},$that is, θ_(n,2)>{circumflex over (θ)}_(n). It follows from the uniformconvergence of Z_(n) and Ż_(n) on K to Z and Ż in probability,respectively and the uniform continuity of Z and Ż on the compact subsetK that there exists a positive integer N₅ such that for all n>N₅,

${P_{0}\left( {{\sup\limits_{\theta \in K}{{\frac{Z_{n}(\theta)}{{\overset{.}{Z}}_{n}(\theta)} - \frac{Z(\theta)}{\overset{.}{Z}(\theta)}}}} < \varepsilon} \right)} > {1 - \delta}$Consequently, P₀(H_(n))>1−δ where

$H_{n} = \left\{ {{{- \frac{Z\left( \theta_{1} \right)}{\overset{.}{Z}\left( \theta_{1} \right)}} - \varepsilon} < \frac{Z_{n}\left( \theta_{1} \right)}{{\overset{.}{Z}}_{n}\left( \theta_{1} \right)} < {\frac{{Z\left( \theta_{1} \right)}}{\overset{.}{Z}\left( \theta_{1} \right)} + \varepsilon}} \right\}$Thus for all

${n > {\max\limits_{1 \leq i \leq 5}{\left\{ N_{i} \right\}\mspace{14mu}{and}\mspace{14mu} 0}} < ɛ < {\min\left\{ {{\theta_{0} - L},{\overset{..}{Z}({\theta\_})},{\overset{.}{Z}(L)},\frac{{Z\left( \theta_{1} \right)}}{\overset{.}{Z}\left( \theta_{1} \right)}} \right\}}},$we have P₀(C_(n)∩E_(n)∩F_(n)∩G_(n)∩H_(n))>1−5δ, which implies thatP₀({circumflex over (θ)}_(n)≦{circumflex over (θ)}_(n,2)<M)>1−5δ.

Therefore it is enough for us to show that, with probability tending toone, the Newton-Raphson iteration {{circumflex over (θ)}_(n,k)}_(k=1)^(∞) converges to {circumflex over (θ)}_(n) as k→∞ from any startingpoint {circumflex over (θ)}_(n,1)ε[{circumflex over (θ)}_(n), ∞). In thefollowing, we will use the same notations and definitions of events asgiven above. Let the function g_(n) be defined by

${{g_{n}(\theta)} = {{\theta - {\frac{Z_{n}(\theta)}{{\overset{.}{Z}}_{n}(\theta)}\mspace{14mu}{for}\mspace{14mu}\theta}} \in I_{n}}},$where I_(n) denotes the interval [{circumflex over (θ)}_(n), M). Weconsider the equation θ=g_(n)(θ) on the interval I_(n) and it is clearthat g_(n) is a continuous function of θ. We observe that forsufficiently large n and for 0<ε<min{θ₀−L,{umlaut over (Z)}(θ_),Ż(L)},on the event B_(n)∩E_(n)F_(n)∩G_(n), we have g_(n)({circumflex over(θ)}_(n))={circumflex over (θ)}_(n) in view of the equationZ_(n)({circumflex over (θ)}_(n))=0 so that {circumflex over (θ)}_(n) isa fixed point of g_(n). Since {circumflex over (θ)}_(n) is the uniqueroot of Z_(n) by Theorem 3, we conclude that {circumflex over (θ)}_(n)is the only fixed point of g_(n) on I_(n). To show that the functionaliterations converge, we compute the derivative of g_(n)(θ) with respectto θ and obtain

${{\overset{.}{g}}_{n}(\theta)} = \frac{{Z_{n}(\theta)}{{\overset{..}{Z}}_{n}(\theta)}}{\left\lbrack {Z_{n}(\theta)} \right\rbrack^{2}}$

For simplicity of presentation, all the following arguments are made onthe event B_(n)∩E_(n)∩F_(n)∩G_(n) for sufficiently large n and for0<ε<min{θ₀−L,{umlaut over (Z)}(θ_),Ż(L)} unless otherwise specified.Since {umlaut over (Z)}_(n)(θ)>0 and Z_(n)(θ)≧0 for all θ ε I_(n), weconclude that ġ_(n)(θ)≧0 for all θ ε I_(n) (in fact, ġ_(n)(θ)>0 for allθ ε I_(n)\{{circumflex over (θ)}_(n)}). Thus g_(n) is an increasingfunction of θ, which implies that {circumflex over(θ)}_(n)=g_(n)({circumflex over (θ)}_(n))≦g_(n)(θ)<M for all θ ε I_(n).That is, the function g_(n) maps the interval I_(n) into itself. Itfollows from Ż_(n)(θ)>0 and Z_(n)(0)≧0 for all θ ε I_(n) that

${\hat{\theta}}_{n,{k + 1}} = {{g_{n}\left( {\hat{\theta}}_{n,k} \right)} = {{{\hat{\theta}}_{n,k} - \frac{Z_{n}\left( {\hat{\theta}}_{n,k} \right)}{{\overset{.}{Z}}_{n}\left( {\hat{\theta}}_{n,k} \right)}} \leq {\hat{\theta}}_{n,k}}}$for {circumflex over (θ)}_(n,k) ε I_(n). This fact coupled with theobservation that {circumflex over (θ)}_(n,k+1)=g_(n)({circumflex over(θ)}_(n,k))≧g_(n)({circumflex over (θ)}_(n))={circumflex over (θ)}_(n)for {circumflex over (θ)}_(n,k) ε I_(n) implies that the sequence{{circumflex over (θ)}_(n,k)}_(k=1) ^(∞) is a monotonically decreasingsequence bounded below. Hence, the limit

$\lim\limits_{k->\infty}{\hat{\theta}}_{n,k}$exists in I_(n) and this limit is denoted by {circumflex over(θ)}_(n,∞). Invoking the continuity of g_(n) shows that {circumflex over(θ)}_(n,∞)=g_(n)({circumflex over (θ)}_(n,∞)). That is, {circumflex over(θ)}_(n,∞) is a fixed point of g_(n) in I_(n). By the uniqueness of thefixed point of g_(n) on I_(n), we conclude that {circumflex over(θ)}_(n,∞)={circumflex over (θ)}_(n). Since for

${n > {\max\limits_{1 \leq i \leq 4}\left\{ N_{i} \right\}}},{{P_{0}\left( {B_{n}\bigcap E_{n}\bigcap F_{n}\bigcap G_{n}} \right)} > {1 - {4\;{\delta.}}}}$We conclude that

${P_{0}\left( {{\lim\limits_{k->\infty}{\hat{\theta}}_{n,k}} = {\hat{\theta}}_{n}} \right)} > {1 - {4{\delta.}}}$

Some numerical examples are provided in FIG. 9 to demonstrate the globalconvergence of Newton-Raphson algorithm for the sample-based shapeestimating equation. From FIG. 9, we see that starting from any point inthe semi-infinite interval Θ_(min), no matter how far away the startingvalue θ₁ is from the true parameter θ₀, with probability tending to 1,Newton-Raphson iterations {circumflex over (θ)}_(n,k) converge to{circumflex over (θ)}_(n) as k→∞. More specifically, in FIG. 9, globalconvergence of the Newton-Raphson algorithm is demonstrated by plots ofthe trajectory of the functional iterations with different startingvalues. The solid lines represent the shape estimates {circumflex over(θ)}_(n,k) plotted as functions of the iteration number k. The dottedline denotes the true parameter θ_(n). The sample shape function Z_(n)and its derivative Ż_(n) are based on a single random sample of sizen=500 generated from the GGD with scale parameter σ₀=1 and with the trueshape parameter θ₀ corresponding to four different values: (a) θ₀=2, (b)θ₀=1.5, (c) θ₀=1, and (d) θ₀=0.5.

Many modifications and other embodiments of the inventions set forthherein will come to mind to one skilled in the art to which theseinventions pertain having the benefit of the teachings presented in theforegoing descriptions and the associated drawings. Therefore, it is tobe understood that the inventions are not to be limited to the specificembodiments disclosed and that modifications and other embodiments areintended to be included within the scope of the appended claims.Although specific terms are employed herein, they are used in a genericand descriptive sense only and not for purposes of limitation.

1. A computer-implemented method for performing digital signalprocessing, comprising: obtaining a first set of digitized coefficientsfrom source data; determining a best-fit distribution of a generalizedGaussian distribution for the set of digitized coefficients, whereindetermining the best-fit distribution includes determining parametersfor the generalized Gaussian distribution, wherein determining theparameters includes determining an actual shape parameter of thegeneralized Gaussian distribution by: (i) applying a first algorithm tothe digitized coefficients, wherein the first algorithm depends upon aratio of a first summation of the digitized coefficients to a secondsummation of the digitized coefficients and wherein the first and secondsummations depend, at least in part, upon a variable shape parameter,and (ii) determining, via one or more iterations, a root value for thevariable shape parameter of the first algorithm, wherein the iterationsare globally-convergent to the root value irrespective of a startingvalue of the variable shape parameter and wherein the actual shapeparameter comprises the root value; wherein the first algorithmcomprises${\frac{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\;\theta}}}{\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right)^{2}} - \left( {\theta + 1} \right)},$where n in the first algorithm is associated with the number ofdigitized coefficient X_(i) in the first algorithm is associated with avalue of each of the digitized coefficients, and θ is associated withthe variable shape parameter; applying a quantization algorithm to thefirst set of digitized coefficients to obtain a second set ofquantizers, wherein the quantization algorithm is based at least in parton the determined best-fit distribution; and providing the second set ofquantizers as a compressed representation of the source data, whereinthe prior steps are performed by one or more computers.
 2. Thecomputer-implemented method of claim 1, wherein the iterations compriseone of Newton-Raphson iterations or Halley's method iterations.
 3. Thecomputer-implemented method of claim 1, wherein determining parametersfurther includes determining a scale parameter of the generalizedGaussian distribution from the value of the actual shape parameter. 4.The computer-implemented method of claim 1, wherein the source dataincludes an analog signal and obtaining the first set of digitizedcoefficients comprises applying a transform to the analog signal togenerate the first set of digitized coefficients.
 5. Thecomputer-implemented method of claim 4, wherein the transform comprisesa time-frequency transform, wherein the time-frequency transformincludes one of a Fast Fourier Transform, a Wavelet Transform, or aDiscrete Cosine Transform.
 6. The computer-implemented method of claim1, wherein the quantization algorithm comprises a Lloyd-Max algorithm.7. A system for performing digital signal processing, comprising: amemory for storing executable instructions; a processor in communicationwith the memory, wherein the process is operable to execute the storedinstructions to: obtain a first set of digitized coefficients fromsource data; determine a best-fit distribution of a generalized Gaussiandistribution for the set of digitized coefficients, wherein the best-fitdistribution is determined at least in part by determining an actualshape parameter of the generalized Gaussian distribution by: (i)applying a first algorithm to the digitized coefficients, wherein thefirst algorithm depends upon a ratio of a first summation of thedigitized coefficients to a second summation of the digitizedcoefficients and wherein the first and second summations depend, atleast in part, upon a variable shape parameter, and (ii) determining,via one or more iterations, a root value for the variable shapeparameter of the first algorithm, wherein the iterations areglobally-convergent to the root value irrespective of a starting valueof the variable shape parameter and wherein the actual shape parametercomprises the root value; wherein the first algorithm comprises${\frac{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\;\theta}}}{\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right)^{2}} - \left( {\theta + 1} \right)},$where n in the first algorithm is associated with the number ofdigitized coefficient, X_(i) in the first algorithm is associated with avalue of each of the digitized coefficients, and θ is associated withthe variable shape parameter; apply a quantization algorithm to thefirst set of digitized coefficients to obtain a second set ofquantizers, wherein the quantization algorithm is based at least in parton the determined best-fit distribution; and provide the second set ofquantizers as a compressed representation of the source data.
 8. Thesystem of claim 7, wherein the iterations comprise one of Newton-Raphsoniterations or Halley's method iterations.
 9. The system of claim 7,wherein the quantization algorithm comprises a Lloyd-Max algorithm. 10.The system of claim 7, wherein the source data includes an analog signaland obtaining the first set of digitized coefficients comprises applyinga transform to the analog signal to generate the first set of digitizedcoefficients.
 11. A computer-implemented method for performing digitalsignal processing, comprising: retrieving suspected encoded data;determining at least one parameter of a generalized Gaussiandistribution for the suspected encoded data, wherein determining atleast one parameter includes determining an actual shape parameter ofthe generalized Gaussian distribution by: (i) applying a first algorithmto digitized coefficients of the suspected encoded data, wherein thefirst algorithm depends upon a ratio of a first summation of thedigitized coefficients to a second summation of the digitizedcoefficients and wherein the first and second summations depend, atleast in part, upon a variable shape parameter, and (ii) determining,via one or more iterations, a root value for the variable shapeparameter of the first algorithm, wherein the iterations areglobally-convergent to the root value irrespective of a starting valueof the variable shape parameter and wherein the actual shape parametercomprises the root value; wherein the first algorithm comprises${\frac{\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{2\theta}}}{\left( {\frac{1}{n}{\sum\limits_{i = 1}^{n}{X_{i}}^{\theta}}} \right)^{2}} - \left( {\theta + 1} \right)},$where n in the first algorithm is associated with the number ofdigitized coefficient X_(i) in the first algorithm is associated with avalue of each of the digitized coefficients, and θ is associated withthe variable shape parameter; determining a digital watermark within thesuspected encoded data based at least in part on at least one thedetermined parameter; and extracting the digital watermark from thesuspected encoded data, wherein the prior steps are performed by one ormore computers.
 12. The computer-implemented method of claim 11, whereinthe iterations comprise one of Newton-Raphson iterations or Halley'smethod iterations.
 13. The computer-implemented method of claim 11,wherein the digital watermark includes a plurality of bits associatedwith an origin of the encoded data.